Image Classification

1. Classify grayscale images (28 × 28 pixels) of handwritten digits into 10 categories (0 through 9)

Problem we are solving

- We are provided 60k training images (MNIST dataset). Each image is a 28x28 matrix, and each element is an integer in the range [0, 255].
- We need to create an ML model that is trained on these 60k images and predicts the digit (0 to 9) shown in any input image.

How is an image represented by a 28x28 matrix?

In a grayscale image represented as a (28, 28) matrix, each element of the matrix corresponds to the intensity of the pixel at that position.
0 represents black: A pixel with an intensity value of 0 is completely black.
255 represents white: A pixel with an intensity value of 255 is completely white.
Values in between represent shades of gray: The values between 0 and 255 represent varying shades of gray, with higher values indicating lighter shades
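
As a small illustration of the intensity scale described above, the sketch below builds a tiny grayscale patch with NumPy. The array `patch` and its values are made up for this example; real MNIST images are 28x28 but use the same 0 to 255 range.

```python
import numpy as np

# Hypothetical 3x3 grayscale patch (real MNIST images are 28x28).
# dtype uint8 stores integers in the range [0, 255].
patch = np.array(
    [[0, 128, 255],
     [64, 192, 32],
     [255, 0, 100]],
    dtype=np.uint8,
)

darkest = int(patch.min())    # 0 is black
lightest = int(patch.max())   # 255 is white
print(patch.shape, darkest, lightest)
```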

1. Create Neural Network of 2 dense layers

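a. Create a Sequential model using Keras: a linear stack of layers, where the input and output of each layer is a tensor.
b. Add 2 Dense layers: the first with 512 neurons, ReLU activation, and input tensor shape (28*28,); the second with 10 neurons and softmax activation, giving one probability per digit.
c. Add the loss function (categorical_crossentropy), optimizer (rmsprop), and metrics (accuracy).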

        @startuml
        skinparam componentStyle rectangle
        node "Reshape" {
          [Flatten 28x28 to 784\n\nDivide by 255\n\n each value b/w 0,1] as shaper
        }
        node "Neural Network" {
          [Dense_Layer_1\n\nneuron=512\nActivation=Relu] as dl1
          [Dense_Layer_2\n\nneuron=10\nActivation=softmax] as dl2
        }
        interface "train_images = 60000 28x28 matrices \n\n (60000, 28, 28) \n\n each value b/w[0,255]" as input
        interface "Probability b/w 0-1 of digit" as p
        
        input ..> shaper
        shaper ..> dl1 : "train_images = (60000, 28*28)\n each value b/w 0,1"
        dl1 ..> dl2 : 2d tensor
        dl2 ..> dl1 : Loss Function=\ncategorical_crossentropy
        dl2 ..> p
        @enduml


from keras import models    # Keras is a deep-learning library; models provides the Sequential class
from keras import layers    # layers provides the Dense layer class
from keras.datasets import mnist

network = models.Sequential()       # a. Empty model; layers are stacked in the order they are added
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))  # b. Hidden layer: 512 neurons, ReLU
network.add(layers.Dense(10, activation='softmax'))  # b. Output layer: 10 neurons, one probability per digit
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])  # c
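
To make the two Dense layers and the loss less abstract, here is a minimal NumPy sketch of the arithmetic they perform in a forward pass. It is not how Keras is implemented. The weights are random rather than trained, and the names `relu`, `softmax`, `categorical_crossentropy`, `W1`, `b1`, `W2`, `b2` and the fake inputs are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # ReLU keeps positive values and replaces negatives with 0.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize
    # so that every row sums to 1.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Per-sample loss: -sum(y_true * log(y_pred)) over the 10 classes.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Untrained, randomly initialized parameters with the same shapes as the
# two Dense layers: 784 -> 512 -> 10.
W1 = rng.normal(scale=0.05, size=(784, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.05, size=(512, 10))
b2 = np.zeros(10)

x = rng.random((4, 784))               # 4 fake flattened images in [0, 1)
hidden = relu(x @ W1 + b1)             # Dense(512, activation='relu')
probs = softmax(hidden @ W2 + b2)      # Dense(10, activation='softmax')

y_true = np.eye(10)[[3, 1, 4, 1]]      # made-up one-hot labels
loss = categorical_crossentropy(y_true, probs)

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, loss.mean(), n_params)
```

Each row of `probs` holds 10 probabilities that sum to 1, and the loss is small only when the probability assigned to the true digit is close to 1.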

2. Prepare input data for neural network


(train_images, train_labels), (test_images, test_labels) = mnist.load_data()  # a. Load 60k images, shape (60000, 28, 28)
train_images = train_images.reshape((60000, 28 * 28))   # b. Flatten each 28x28 image to 784 values
train_images = train_images.astype('float32') / 255     # c. Scale every value from [0, 255] to [0, 1]

a. Load the MNIST dataset (60,000 training images plus 10,000 test images) into local variables

train_images: Array containing 60,000 elements (28x28 matrices).
train_labels: Array of 60,000 labels. Each label is an integer from 0 to 9, the digit shown in the corresponding image.


(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

b. Reshape each image in train_images and test_images from a 28(row)x28(col) matrix to a vector of 28*28=784 values

Reshaping means rearranging a tensor to fit the input size needed by a neural network layer. The values themselves do not change.
Why reshape? The first Dense layer expects each image as a 1-D array of 784 values.


# Examples
>>> import numpy as np
>>> x = np.array([[1,2],[3,4],[5,6]])
>>> print(x.shape)
(3, 2)       # 3 rows, 2 cols
>>> x = x.reshape(2,3)
>>> x
array([[1, 2, 3],
       [4, 5, 6]])

train_images = train_images.reshape ((60000, 28*28))
test_images = test_images.reshape ((10000, 28*28))
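
The following sketch checks that flattening only rearranges the pixels and does not change them. It uses a small stand-in array named `images` (5 fake images instead of 60,000), which is an assumption made to keep the example light.

```python
import numpy as np

# Stand-in for a batch of images: 5 fake 28x28 images with distinct values.
images = np.arange(5 * 28 * 28).reshape((5, 28, 28))

# Each image becomes one row of 784 values.
flat = images.reshape((5, 28 * 28))

# NumPy flattens row by row (C order), so pixel (r, c) of an image
# lands at index r * 28 + c of the flattened row.
r, c = 3, 7
same_pixel = bool(flat[2, r * 28 + c] == images[2, r, c])

# Reshaping back recovers the original images exactly.
restored = flat.reshape((5, 28, 28))
print(flat.shape, same_pixel, np.array_equal(restored, images))
```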

c. Convert all train_images,test_images from [0,255] to [0,1]


# Divide every value by 255.
0/255 = 0
148/255 = 0.5803
...
1/255 = 0.003921

train_images = train_images.astype ('float32') / 255
test_images = test_images.astype ('float32') / 255
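
A quick check of the scaling step on a few sample values can look like this. The array `pixels` is a made-up example; casting to float32 before dividing keeps the result in float32.

```python
import numpy as np

# A few sample pixel intensities stored as 8-bit integers.
pixels = np.array([0, 1, 148, 255], dtype=np.uint8)

# Cast to float32 first, then divide so the values fall in [0, 1].
scaled = pixels.astype('float32') / 255

print(scaled.dtype, scaled.min(), scaled.max())
```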

d. Convert train_labels and test_labels to one-hot encoded labels


train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
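
One-hot encoding turns each integer label into a vector of 10 values with a single 1 at the position of the digit. The sketch below reproduces that idea with plain NumPy. It is a stand-in for illustration, not the Keras implementation of to_categorical, and the `labels` array is made up.

```python
import numpy as np

labels = np.array([5, 0, 4, 1])     # made-up digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one row per label.
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)
print(one_hot[0])   # label 5: the only 1 is at index 5
```

This format matches the 10 softmax outputs of the network, which is what categorical_crossentropy compares against.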

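3. Feed data to the neural network and create a training loop

Feed data in batches of 128, i.e. out of the 60k flattened images, only 128 are fed at a time. With 60,000 images this gives 469 batches per epoch, and epochs=5 means five full passes over the training data.
For each batch, the network computes the gradients of the loss with respect to the weights and updates the weights accordingly.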


network.fit(train_images, train_labels, epochs=5, batch_size=128)
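
The batching itself can be sketched as below. This is only an illustration of how the data is sliced. It ignores the shuffling that fit applies by default and does not perform any weight updates. The helper `iter_batches` and the zero-filled arrays are hypothetical.

```python
import math
import numpy as np

def iter_batches(x, y, batch_size):
    """Yield consecutive (x, y) slices of at most batch_size samples."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]

# Small stand-in data: 1000 samples instead of 60000.
x = np.zeros((1000, 784), dtype='float32')
y = np.zeros((1000, 10), dtype='float32')

sizes = [len(xb) for xb, _ in iter_batches(x, y, 128)]

# Number of batches per epoch for the full training set.
steps_per_epoch_mnist = math.ceil(60000 / 128)

print(len(sizes), sizes[-1], steps_per_epoch_mnist)
```

The last batch is smaller than the others whenever the number of samples is not a multiple of the batch size.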

Complete Code


from keras.datasets import mnist
from keras import models
from keras import layers
from keras.utils import to_categorical
import matplotlib.pyplot as plt

# 1. Create Neural Network
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
network.add(layers.Dense(10, activation='softmax'))
network.compile(optimizer='rmsprop', loss='categorical_crossentropy',metrics=['accuracy'])

# 2. Prepare input data for neural network
try:
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
except Exception as e:
    # Re-raise: the steps below cannot run without the data.
    print("Exception in load_data():", e)
    raise
train_images = train_images.reshape ((60000, 28*28))
test_images = test_images.reshape ((10000, 28*28))
train_images = train_images.astype ('float32') / 255
test_images = test_images.astype ('float32') / 255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels) 

# 3. Create training loop
try:
    network.fit(train_images, train_labels, epochs=5, batch_size=128)
except Exception as e:
    # Re-raise so a failed training run is not silently ignored.
    print("Exception: ", e)
    raise

# 4. Plot the digit
digit = train_images[4]
digit = digit.reshape((28, 28))  # Reshape the flattened array to 28x28
plt.imshow(digit, cmap=plt.cm.binary)
plt.show()

# 5. Run training data
> python mnist.py
Epoch 1/5
60000/60000 [==============================] - 9s - loss: 0.2524 - acc: 0.9273
Epoch 2/5
51328/60000 [========================>.....] - ETA: 1s - loss: 0.1035 - acc: 0.969

# 6. Run on testing data
>>> test_loss, test_acc = network.evaluate(test_images, test_labels)
>>> print('test_acc:', test_acc)
test_acc: 0.9785
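
To turn the network's output into a predicted digit, take the index of the largest of the 10 probabilities. The sketch below uses a made-up `probs` array in place of real model output. With the trained model, the output of network.predict(test_images) would play that role.

```python
import numpy as np

# Hypothetical softmax outputs for two images (each row sums to 1).
probs = np.array([
    [0.01, 0.02, 0.05, 0.02, 0.10, 0.03, 0.02, 0.70, 0.03, 0.02],
    [0.85, 0.01, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01],
])

# The predicted digit is the column index with the highest probability.
predicted_digits = np.argmax(probs, axis=1)
print(predicted_digits)   # [7 0]
```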